feat: add out-of-tree and mcp based classification support #375

rootfs · 2025-10-08T20:04:47Z

What type of PR is this?

MCP Classification Test

Run MCP Server

cd  examples/mcp-classifier-server
python server.py --http --port 8090

Run Router

CONFIG_FILE=./config/config-mcp-classifier-example.yaml make run-router

Test prompt

make test-auto-prompt-reasoning

MCP Server log

$ python server.py --http --port 8090
2025-10-10 14:39:00,260 - __main__ - INFO - Starting Regex-Based MCP Classification Server (HTTP mode)
2025-10-10 14:39:00,260 - __main__ - INFO - Available categories: math, science, technology, history, general
2025-10-10 14:39:00,260 - __main__ - INFO - Listening on http://0.0.0.0:8090/mcp
2025-10-10 14:39:00,260 - __main__ - INFO - Server is ready at http://0.0.0.0:8090/mcp
2025-10-10 14:39:00,260 - __main__ - INFO - Health check available at http://0.0.0.0:8090/health
2025-10-10 14:39:03,900 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:03 +0000] "POST /mcp/initialize HTTP/1.1" 200 289 "-" "Go-http-client/1.1"
2025-10-10 14:39:03,900 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:03 +0000] "POST /mcp/tools/list HTTP/1.1" 200 1027 "-" "Go-http-client/1.1"
2025-10-10 14:39:03,901 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:03 +0000] "POST /mcp/resources/list HTTP/1.1" 404 269 "-" "Go-http-client/1.1"
2025-10-10 14:39:03,901 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:03 +0000] "POST /mcp/prompts/list HTTP/1.1" 404 266 "-" "Go-http-client/1.1"
2025-10-10 14:39:03,901 - __main__ - INFO - Returning 5 categories: ['math', 'science', 'technology', 'history', 'general']
2025-10-10 14:39:03,901 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:03 +0000] "POST /mcp/tools/call HTTP/1.1" 200 305 "-" "Go-http-client/1.1"
2025-10-10 14:39:04,712 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:04 +0000] "POST /mcp/initialize HTTP/1.1" 200 289 "-" "Go-http-client/1.1"
2025-10-10 14:39:04,713 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:04 +0000] "POST /mcp/tools/list HTTP/1.1" 200 1027 "-" "Go-http-client/1.1"
2025-10-10 14:39:04,713 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:04 +0000] "POST /mcp/resources/list HTTP/1.1" 404 269 "-" "Go-http-client/1.1"
2025-10-10 14:39:04,713 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:04 +0000] "POST /mcp/prompts/list HTTP/1.1" 404 266 "-" "Go-http-client/1.1"
2025-10-10 14:39:04,713 - __main__ - INFO - Returning 5 categories: ['math', 'science', 'technology', 'history', 'general']
2025-10-10 14:39:04,714 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:04 +0000] "POST /mcp/tools/call HTTP/1.1" 200 305 "-" "Go-http-client/1.1"
2025-10-10 14:39:07,331 - __main__ - INFO - Classifying text: 'What is the derivative of f(x) = x^3 + 2x^2 - 5x + 7?...' (with_probabilities=True)
2025-10-10 14:39:07,333 - __main__ - INFO - Classification result: class=4 (general), confidence=1.000, entropy=2.259, model=openai/gpt-oss-20b, use_reasoning=False
2025-10-10 14:39:07,333 - aiohttp.access - INFO - 127.0.0.1 [10/Oct/2025:14:39:07 +0000] "POST /mcp/tools/call HTTP/1.1" 200 401 "-" "Go-http-client/1.1"

Router log

{"level":"info","ts":"2025-10-10T14:39:03.771054169Z","caller":"observability/logging.go:140","msg":"Starting metrics server on :9190"}
{"level":"info","ts":"2025-10-10T14:39:03.771219608Z","caller":"candle-binding/semantic-router.go:276","msg":"Initializing BERT similarity model: sentence-transformers/all-MiniLM-L6-v2"}
{"level":"info","ts":"2025-10-10T14:39:03.898758668Z","caller":"observability/logging.go:140","msg":"Category descriptions: []"}
{"level":"info","ts":"2025-10-10T14:39:03.898845964Z","caller":"observability/logging.go:140","msg":"Semantic cache is disabled"}
{"level":"info","ts":"2025-10-10T14:39:03.898852414Z","caller":"observability/logging.go:140","msg":"Tools database is disabled"}
{"level":"info","ts":"2025-10-10T14:39:03.901446074Z","caller":"observability/logging.go:140","msg":"Auto-discovered classification tool: classify_text - Classify text into categories and provide intelligent routing recommendations. Categories: math, science, technology, history, general. Returns: class index, confidence, recommended model, and reasoning flag. Optionally returns full probability distribution for entropy analysis."}
{"level":"info","ts":"2025-10-10T14:39:03.901491086Z","caller":"observability/logging.go:140","msg":"Successfully initialized MCP category classifier with tool 'classify_text'"}
{"level":"info","ts":"2025-10-10T14:39:03.901506737Z","caller":"observability/logging.go:140","msg":"Loading category mapping from MCP server..."}
{"level":"info","ts":"2025-10-10T14:39:03.90189718Z","caller":"observability/logging.go:140","msg":"Loaded 5 categories from MCP server: [math science technology history general]"}
{"level":"info","ts":"2025-10-10T14:39:03.901919121Z","caller":"observability/logging.go:140","msg":"Successfully loaded 5 categories from MCP server"}
{"level":"info","ts":"2025-10-10T14:39:03.901929271Z","caller":"observability/logging.go:140","msg":"Successfully initialized MCP category classifier"}
{"level":"info","ts":"2025-10-10T14:39:03.90398565Z","caller":"observability/logging.go:140","msg":"Initializing LoRA models: Intent=models/lora_intent_classifier_bert-base-uncased_model, PII=models/lora_pii_detector_bert-base-uncased_model, Security=models/lora_jailbreak_classifier_bert-base-uncased_model, Architecture=bert"}
Detected BERT token classifier - using BERT naming
{"level":"info","ts":"2025-10-10T14:39:04.712141372Z","caller":"observability/logging.go:140","msg":"LoRA C bindings initialized successfully"}
{"level":"info","ts":"2025-10-10T14:39:04.712217447Z","caller":"observability/logging.go:140","msg":"Category mapping will be loaded from MCP server"}
{"level":"info","ts":"2025-10-10T14:39:04.713712523Z","caller":"observability/logging.go:140","msg":"Auto-discovered classification tool: classify_text - Classify text into categories and provide intelligent routing recommendations. Categories: math, science, technology, history, general. Returns: class index, confidence, recommended model, and reasoning flag. Optionally returns full probability distribution for entropy analysis."}
{"level":"info","ts":"2025-10-10T14:39:04.713732664Z","caller":"observability/logging.go:140","msg":"Successfully initialized MCP category classifier with tool 'classify_text'"}
{"level":"info","ts":"2025-10-10T14:39:04.713740905Z","caller":"observability/logging.go:140","msg":"Loading category mapping from MCP server..."}
{"level":"info","ts":"2025-10-10T14:39:04.714066793Z","caller":"observability/logging.go:140","msg":"Loaded 5 categories from MCP server: [math science technology history general]"}
{"level":"info","ts":"2025-10-10T14:39:04.714084094Z","caller":"observability/logging.go:140","msg":"Successfully loaded 5 categories from MCP server"}
{"level":"info","ts":"2025-10-10T14:39:04.714093105Z","caller":"observability/logging.go:140","msg":"Successfully initialized MCP category classifier"}
{"level":"info","ts":"2025-10-10T14:39:04.714104766Z","caller":"observability/logging.go:140","msg":"Router initialization: Using auto-discovered unified classifier"}
{"level":"info","ts":"2025-10-10T14:39:04.714140538Z","caller":"observability/logging.go:140","msg":"No categories configured for reasoning mode"}
{"level":"info","ts":"2025-10-10T14:39:04.714157289Z","caller":"observability/logging.go:140","msg":"Starting vLLM Semantic Router ExtProc with config: ./config/config-mcp-classifier-example.yaml"}
{"level":"info","ts":"2025-10-10T14:39:04.714236283Z","caller":"observability/logging.go:140","msg":"Starting insecure LLM Router ExtProc server on port 50051..."}
{"level":"info","ts":"2025-10-10T14:39:04.714253364Z","caller":"observability/logging.go:140","msg":"Starting Classification API server on port 8080"}
{"level":"info","ts":"2025-10-10T14:39:04.714743263Z","caller":"observability/logging.go:140","msg":"Found global classification service on attempt 1/5"}
{"level":"info","ts":"2025-10-10T14:39:04.71486373Z","caller":"observability/logging.go:140","msg":"System prompt configuration endpoints enabled"}
{"level":"info","ts":"2025-10-10T14:39:04.714896641Z","caller":"observability/logging.go:140","msg":"Classification API server listening on port 8080"}
{"level":"error","ts":"2025-10-10T14:39:04.714963605Z","caller":"observability/logging.go:142","msg":"Classification API server error: listen tcp :8080: bind: address already in use","stacktrace":"github.com/vllm-project/semantic-router/src/semantic-router/pkg/observability.Errorf\n\t/home/ubuntu/rootfs/back/semantic-router.bak/src/semantic-router/pkg/observability/logging.go:142\nmain.main.func4\n\t/home/ubuntu/rootfs/back/semantic-router.bak/src/semantic-router/cmd/main.go:118"}
{"level":"info","ts":"2025-10-10T14:39:07.329363567Z","caller":"observability/logging.go:140","msg":"Started processing a new request"}
{"level":"info","ts":"2025-10-10T14:39:07.329840825Z","caller":"observability/logging.go:140","msg":"Received request headers"}
{"level":"info","ts":"2025-10-10T14:39:07.331193903Z","caller":"observability/logging.go:140","msg":"Received request body {\"model\": \"auto\", \"messages\": [{\"role\": \"system\", \"content\": \"You are a professional math teacher. Explain math concepts clearly and show step-by-step solutions to problems.\"}, {\"role\": \"user\", \"content\": \"What is the derivative of f(x) = x^3 + 2x^2 - 5x + 7?\"}]}"}
{"level":"info","ts":"2025-10-10T14:39:07.331426576Z","caller":"observability/logging.go:140","msg":"Original model: auto"}
{"level":"info","ts":"2025-10-10T14:39:07.331471849Z","caller":"observability/logging.go:140","msg":"Using Auto Model Selection"}
{"level":"info","ts":"2025-10-10T14:39:07.33149091Z","caller":"observability/logging.go:140","msg":"Routing to model: openai/gpt-oss-20b"}
{"level":"info","ts":"2025-10-10T14:39:07.3337381Z","caller":"observability/logging.go:140","msg":"MCP classification result: class=4, confidence=1.0000, entropy_available=true"}
{"level":"info","ts":"2025-10-10T14:39:07.333780182Z","caller":"observability/logging.go:140","msg":"MCP classified as category: general (mmlu=general), reasoning_decision: use=true, confidence=0.300, reason=very_high_uncertainty_conservative_default"}
{"level":"info","ts":"2025-10-10T14:39:07.333794223Z","caller":"observability/logging.go:140","msg":"Entropy-based reasoning decision: category='general', confidence=1.000, use_reasoning=true, reason=very_high_uncertainty_conservative_default, strategy=high_uncertainty_reasoning_enabled"}
{"level":"info","ts":"2025-10-10T14:39:07.333810674Z","caller":"observability/logging.go:140","msg":"Top predicted categories: [{general 0.32} {math 0.2} {science 0.16}]"}
{"level":"info","ts":"2025-10-10T14:39:07.333820194Z","caller":"observability/logging.go:140","msg":"Entropy-based reasoning decision for this query: true on [openai/gpt-oss-20b] model (confidence: 0.300, reason: very_high_uncertainty_conservative_default)"}
{"level":"info","ts":"2025-10-10T14:39:07.333852206Z","caller":"observability/logging.go:140","msg":"Selected endpoint address: 127.0.0.1:8000 for model: openai/gpt-oss-20b"}
{"level":"info","ts":"2025-10-10T14:39:07.334146783Z","caller":"observability/logging.go:140","msg":"Applied reasoning mode (enabled: true) with effort (high) to model: openai/gpt-oss-20b"}
{"level":"info","ts":"2025-10-10T14:39:07.334175895Z","caller":"observability/logging.go:140","msg":"Use new model: openai/gpt-oss-20b"}
{"level":"info","ts":"2025-10-10T14:39:07.334194186Z","caller":"observability/logging.go:136","msg":"routing_decision","reason_code":"auto_routing","request_id":"e28a4bac-9fd5-4e5f-9539-2a49b2ca10a5","category":"general","reasoning_enabled":true,"reasoning_effort":"high","original_model":"auto","selected_model":"openai/gpt-oss-20b","selected_endpoint":"127.0.0.1:8000","routing_latency_ms":2,"event":"routing_decision"}
{"level":"info","ts":"2025-10-10T14:39:17.864804966Z","caller":"observability/logging.go:136","msg":"llm_usage","completion_latency_ms":10534,"cost":0,"currency":"unknown","event":"llm_usage","request_id":"e28a4bac-9fd5-4e5f-9539-2a49b2ca10a5","model":"openai/gpt-oss-20b","prompt_tokens":121,"completion_tokens":946,"total_tokens":1067,"pricing":"not_configured"}
{"level":"info","ts":"2025-10-10T14:39:17.864849128Z","caller":"observability/logging.go:140","msg":"Cache updated for request ID: e28a4bac-9fd5-4e5f-9539-2a49b2ca10a5"}
{"level":"info","ts":"2025-10-10T14:39:17.865304675Z","caller":"observability/logging.go:140","msg":"Stream canceled gracefully"}

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Release Notes: Yes/No

Signed-off-by: Huamin Chen <[email protected]>

netlify · 2025-10-08T20:04:52Z

✅ Deploy Preview for vllm-semantic-router ready!

Name	Link
🔨 Latest commit	`66629c3`
🔍 Latest deploy log	https://app.netlify.com/projects/vllm-semantic-router/deploys/68e91cc18ca5900008ea2f5b
😎 Deploy Preview	https://deploy-preview-375--vllm-semantic-router.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

github-actions · 2025-10-08T20:04:59Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `config`

Owners: @rootfs
Files changed:

config/config-mcp-classifier-example.yaml

📁 `Root Directory`

Owners: @rootfs, @Xunzhuo
Files changed:

examples/mcp-classifier-server/README.md
examples/mcp-classifier-server/requirements.txt
examples/mcp-classifier-server/server.py

📁 `src`

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

src/semantic-router/pkg/connectivity/mcp/api/types.go
src/semantic-router/pkg/connectivity/mcp/factory.go
src/semantic-router/pkg/connectivity/mcp/http_client.go
src/semantic-router/pkg/connectivity/mcp/interface.go
src/semantic-router/pkg/connectivity/mcp/stdio_client.go
src/semantic-router/pkg/connectivity/mcp/types.go
src/semantic-router/pkg/utils/classification/mcp_classifier.go
src/semantic-router/pkg/utils/classification/mcp_classifier_test.go
src/semantic-router/go.mod
src/semantic-router/go.sum
src/semantic-router/pkg/config/config.go
src/semantic-router/pkg/config/validation_test.go
src/semantic-router/pkg/extproc/test_utils_test.go
src/semantic-router/pkg/services/classification.go
src/semantic-router/pkg/utils/classification/classifier.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

rootfs · 2025-10-08T20:05:51Z

Will add a simple mcp classification server for testing next.

Copilot

Pull Request Overview

This PR introduces MCP (Model Context Protocol) based classification support as an alternative to in-tree classification models. It enables external classification services for category determination while maintaining the existing in-tree classifier as a fallback option.

Adds comprehensive MCP client implementation with stdio and HTTP transport support
Integrates MCP-based category classifier into the existing classification pipeline
Provides flexible configuration options for choosing between in-tree and MCP classification

Reviewed Changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
src/semantic-router/pkg/utils/classification/mcp_classifier.go	Implements MCP-based category classification with full probability distribution support
src/semantic-router/pkg/utils/classification/classifier.go	Updates classifier to support both in-tree and MCP classification methods
src/semantic-router/pkg/connectivity/mcp/*.go	Complete MCP client implementation with factory pattern and multiple transport types
src/semantic-router/pkg/config/config.go	Adds configuration structure for MCP category classifier
src/semantic-router/go.mod	Adds mcp-go dependency for MCP protocol support
config/config-mcp-classifier-example.yaml	Provides example configuration for MCP-based classification

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-08T20:06:09Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+		reasoningDecision.Confidence,
+		reasoningDecision.UseReasoning,
+		reasoningDecision.DecisionReason,
+		topCategory,


The topCategory variable is being passed twice as arguments (lines 402 and 408) to RecordEntropyClassificationMetrics. This appears to be a copy-paste error where the second occurrence should likely be a different parameter.

Copilot · 2025-10-08T20:06:10Z

src/semantic-router/pkg/connectivity/mcp/stdio_client.go

+	// NOTE: Commented out - promptGuard is in BaseClient which is commented out
+	// if c.promptGuard != nil && c.promptGuard.IsEnabled() {
+	if false {
+		if err := (error)(nil); err != nil { // c.promptGuard.ValidateToolCall(name)


This disabled code block with hardcoded false condition and dummy error assignment should be removed or properly implemented. The commented-out functionality suggests incomplete security validation that could be important for production use.

Suggested change

// NOTE: Commented out - promptGuard is in BaseClient which is commented out

// if c.promptGuard != nil && c.promptGuard.IsEnabled() {

if false {

if err := (error)(nil); err != nil { // c.promptGuard.ValidateToolCall(name)

// Validate tool call using promptGuard if enabled

if c.promptGuard != nil && c.promptGuard.IsEnabled() {

if err := c.promptGuard.ValidateToolCall(name); err != nil {

Copilot · 2025-10-08T20:06:10Z

src/semantic-router/pkg/connectivity/mcp/http_client.go

+
+	// Try to initialize the connection
+	initRequest := mcp.InitializeRequest{}
+	initRequest.Params.ProtocolVersion = "2024-11-05"


The hardcoded protocol version should be defined as a constant to ensure consistency across the codebase and make version updates easier to manage.

Copilot · 2025-10-08T20:06:10Z

src/semantic-router/pkg/connectivity/mcp/types.go

+func ImageContentBlock(data, mimeType string) ContentBlock {
+	return ContentBlock{
+		Type: "image",
+		Text: data, // In practice, this would be base64 encoded data


The comment indicates this field should contain base64 encoded data, but the function parameter is named data without specifying the expected format. This could lead to incorrect usage where non-base64 data is passed.

Suggested change

func ImageContentBlock(data, mimeType string) ContentBlock {

return ContentBlock{

Type: "image",

Text: data, // In practice, this would be base64 encoded data

func ImageContentBlock(base64Data, mimeType string) ContentBlock {

return ContentBlock{

Type: "image",

Text: base64Data, // base64-encoded image data

Copilot · 2025-10-08T20:06:11Z

src/semantic-router/pkg/connectivity/mcp/client.go

+// Legacy types moved to interface.go - keeping for backward compatibility
+
+// NewLegacyClient creates a legacy MCP client wrapper (deprecated - use factory instead)
+func NewLegacyClient(name string, config ClientConfig) *Client {


The function is marked as deprecated but still implemented. Consider adding a deprecation warning log or removing this function entirely if the factory pattern is the preferred approach.

Suggested change

func NewLegacyClient(name string, config ClientConfig) *Client {

func NewLegacyClient(name string, config ClientConfig) *Client {

log.Printf("[DEPRECATION WARNING] NewLegacyClient is deprecated and will be removed in a future release. Please use the factory pattern instead.")

Copilot · 2025-10-08T20:06:11Z

src/semantic-router/pkg/utils/classification/classifier.go

+	}
+
+	// Add MCP classifier if configured (only if in-tree is not configured)
+	if cfg.Classifier.MCPCategoryModel.Enabled && cfg.Classifier.CategoryModel.ModelID == "" {


The condition that MCP classifier is only enabled when in-tree classifier is disabled (ModelID is empty) creates a mutually exclusive relationship that could be confusing. Consider making this relationship more explicit or allowing both to coexist with clear priority rules.

Signed-off-by: Huamin Chen <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-09T01:48:56Z

src/semantic-router/pkg/connectivity/mcp/types.go

+	return authHeader
+}
+
+// SSEMessage represents a Server-Sent Events message


Incomplete documentation - the SSEMessage type is declared in the comment but the actual struct definition is missing from the file.

Suggested change

// SSEMessage represents a Server-Sent Events message

// SSEMessage represents a Server-Sent Events message

type SSEMessage struct {

Event string `json:"event,omitempty"`

Data interface{} `json:"data"`

ID string `json:"id,omitempty"`

Retry int `json:"retry,omitempty"`

}

Copilot · 2025-10-09T01:48:56Z

src/semantic-router/pkg/connectivity/mcp/stdio_client.go

+		// Start the client (required before Initialize)
+		// NOTE: Start() method removed in mcp-go v0.9.0 - connection starts automatically
+		// if err := c.mcpClient.Start(ctx); err != nil {
+		// 	c.log(LoggingLevelError, fmt.Sprintf("Failed to start MCP client: %v", err))
+		// 	initDone <- fmt.Errorf("failed to start MCP client: %w", err)
+		// 	return
+		// }


Large block of commented-out code should be removed. If this is important for version compatibility documentation, consider moving it to a separate comment explaining the API change.

Suggested change

// Start the client (required before Initialize)

// NOTE: Start() method removed in mcp-go v0.9.0 - connection starts automatically

// if err := c.mcpClient.Start(ctx); err != nil {

// c.log(LoggingLevelError, fmt.Sprintf("Failed to start MCP client: %v", err))

// initDone <- fmt.Errorf("failed to start MCP client: %w", err)

// return

// }

// Start() method removed in mcp-go v0.9.0 - connection starts automatically

Copilot · 2025-10-09T01:48:56Z

src/semantic-router/pkg/connectivity/mcp/http_client.go

+	initRequest.Params.ClientInfo = mcp.Implementation{
+		Name:    "http-mcp-client",
+		Version: "1.0.0",
+	}


Hardcoded version '1.0.0' should be extracted to a constant or configuration parameter to maintain consistency and enable easier version management.

Signed-off-by: Huamin Chen <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-09T02:17:45Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+	// Set timeout if specified
+	if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > 0 {


Consider adding validation to ensure TimeoutSeconds is within reasonable bounds (e.g., greater than 0 and less than a maximum threshold) to prevent indefinite timeouts or zero-duration timeouts.

Suggested change

// Set timeout if specified

if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > 0 {

// Set timeout if specified and within reasonable bounds

const maxTimeoutSeconds = 300 // 5 minutes

if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > 0 {

if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > maxTimeoutSeconds {

return fmt.Errorf("TimeoutSeconds (%d) exceeds maximum allowed (%d seconds)", cfg.Classifier.MCPCategoryModel.TimeoutSeconds, maxTimeoutSeconds)

}

Copilot · 2025-10-09T02:17:46Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+	// Check threshold
+	threshold := c.Config.Classifier.MCPCategoryModel.Threshold
+	if threshold == 0 {
+		threshold = 0.5 // Default threshold


The default threshold value (0.5) is hardcoded in multiple places in the file (lines 264 and 345). Consider defining this as a constant to ensure consistency and easier maintenance.

Copilot · 2025-10-09T02:17:46Z

src/semantic-router/pkg/connectivity/mcp/stdio_client.go

+		}()
+
+		c.log(LoggingLevelDebug, fmt.Sprintf("Starting MCP client..."))
+		// Note: Start() method removed in mcp-go v0.9.0+ - connection starts automatically


This comment indicates the code is compatible with mcp-go v0.9.0+, but the go.mod file shows v0.42.0-beta.1. The comment should be updated to reflect the actual version being used or clarify the version compatibility.

Suggested change

// Note: Start() method removed in mcp-go v0.9.0+ - connection starts automatically

// Note: In mcp-go v0.42.0-beta.1 (current), connection starts automatically; Start() method is not used.

Copilot · 2025-10-09T02:17:46Z

src/semantic-router/pkg/connectivity/mcp/types.go

+		// The mcp.Content type doesn't have a Type field, so we need to use type assertion directly
+		firstContent := result.Content[0]
+
+		// Try to extract content based on the actual type
+		if textContent, ok := firstContent.(*mcp.TextContent); ok {
+			content = textContent.Text
+		} else if imageContent, ok := firstContent.(*mcp.ImageContent); ok {
+			content = fmt.Sprintf("Image: %s", imageContent.Data)
+		} else if resourceContent, ok := firstContent.(*mcp.EmbeddedResource); ok {
+			// The mcp.EmbeddedResource has Resource field but it's an interface, so we'll use string representation
+			content = fmt.Sprintf("Resource: %v", resourceContent.Resource)
+		} else {
+			// Fallback: try to get string representation


The type assertions for mcp.TextContent, mcp.ImageContent, and mcp.EmbeddedResource may not work correctly with the actual mcp library types. These types should be verified against the actual mcp-go library API to ensure proper type matching.

Suggested change

// The mcp.Content type doesn't have a Type field, so we need to use type assertion directly

firstContent := result.Content[0]

// Try to extract content based on the actual type

if textContent, ok := firstContent.(*mcp.TextContent); ok {

content = textContent.Text

} else if imageContent, ok := firstContent.(*mcp.ImageContent); ok {

content = fmt.Sprintf("Image: %s", imageContent.Data)

} else if resourceContent, ok := firstContent.(*mcp.EmbeddedResource); ok {

// The mcp.EmbeddedResource has Resource field but it's an interface, so we'll use string representation

content = fmt.Sprintf("Resource: %v", resourceContent.Resource)

} else {

// Fallback: try to get string representation

firstContent := result.Content[0]

// Use a type switch to match the actual types from mcp-go

switch c := firstContent.(type) {

case *mcp.TextContent:

content = c.Text

case mcp.TextContent:

content = c.Text

case *mcp.ImageContent:

content = fmt.Sprintf("Image: %s", c.Data)

case mcp.ImageContent:

content = fmt.Sprintf("Image: %s", c.Data)

case *mcp.EmbeddedResource:

content = fmt.Sprintf("Resource: %v", c.Resource)

case mcp.EmbeddedResource:

content = fmt.Sprintf("Resource: %v", c.Resource)

default:

Signed-off-by: Huamin Chen <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-09T02:25:43Z

src/semantic-router/pkg/connectivity/mcp/types.go

+	var arguments map[string]interface{}
+	if openAICall.Function.Arguments != "" {
+		if err := json.Unmarshal([]byte(openAICall.Function.Arguments), &arguments); err != nil {
+			return mcp.CallToolRequest{}, fmt.Errorf("failed to parse arguments: %w", err)


The error message for JSON parsing should provide more context about what failed to parse.

Suggested change

return mcp.CallToolRequest{}, fmt.Errorf("failed to parse arguments: %w", err)

argStr := openAICall.Function.Arguments

const maxLen = 200

if len(argStr) > maxLen {

argStr = argStr[:maxLen] + "...(truncated)"

}

return mcp.CallToolRequest{}, fmt.Errorf("failed to parse arguments (%q): %w", argStr, err)

Copilot · 2025-10-09T02:25:43Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+)
+
+const (
+	// DefaultMCPThreshold is the default confidence threshold for MCP classification


The DefaultMCPThreshold constant should have a comment explaining why 0.5 was chosen as the default value.

Suggested change

// DefaultMCPThreshold is the default confidence threshold for MCP classification

// DefaultMCPThreshold is the default confidence threshold for MCP classification.

// A value of 0.5 is commonly used as it represents a neutral cutoff for binary or probabilistic classification,

// meaning predictions with confidence above 50% are considered positive. Adjust as needed for your use case.

Copilot · 2025-10-09T02:25:43Z

src/semantic-router/pkg/utils/classification/classifier.go

+	// Note: Both in-tree and MCP classifiers can be configured simultaneously.
+	// At runtime, in-tree classifier will be tried first, with MCP as a fallback.
+	// This allows flexible deployment scenarios (e.g., gradual migration, A/B testing).


This important fallback behavior should also be documented in the function-level documentation, not just as inline comments.

Signed-off-by: Huamin Chen <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-09T02:30:53Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+	// A value of 0.5 is commonly used as it represents a neutral cutoff for binary or probabilistic classification,
+	// meaning predictions with confidence above 50% are considered positive. Adjust as needed for your use case.


The comment refers to 'binary or probabilistic classification' but this appears to be for multi-class category classification. The explanation about '50% positive' is misleading for multi-class scenarios where 0.5 threshold means the predicted class needs >50% confidence.

Suggested change

// A value of 0.5 is commonly used as it represents a neutral cutoff for binary or probabilistic classification,

// meaning predictions with confidence above 50% are considered positive. Adjust as needed for your use case.

// For multi-class classification, a value of 0.5 means that a predicted class must have more than 50% confidence

// to be selected. Adjust this threshold as needed for your use case.

Copilot · 2025-10-09T02:30:54Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+		c.Config.Classifier.MCPCategoryModel.ToolName != "" &&
+		c.mcpCategoryInitializer != nil


The method checks if mcpCategoryInitializer is not nil, but this creates a circular dependency where the initializer must exist to determine if MCP is enabled. Consider separating configuration validation from runtime state checking.

Suggested change

c.Config.Classifier.MCPCategoryModel.ToolName != "" &&

c.mcpCategoryInitializer != nil

c.Config.Classifier.MCPCategoryModel.ToolName != ""

Copilot · 2025-10-09T02:30:54Z

src/semantic-router/pkg/connectivity/mcp/http_client.go

+	// MCPProtocolVersion is the MCP protocol version supported by this implementation
+	MCPProtocolVersion = "2024-11-05"


The hardcoded protocol version '2024-11-05' should be configurable or derived from the mcp-go library to ensure compatibility and easier updates.

Signed-off-by: Huamin Chen <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-09T23:34:56Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+
+const (
+	// DefaultMCPThreshold is the default confidence threshold for MCP classification.
+	// For multi-class classification, a value of 0.5 means that a predicted class must have more than 50% confidence


The documentation incorrectly states 'more than 50% confidence' when the code uses >= comparison (result.Confidence < threshold). A threshold of 0.5 means the confidence must be 0.5 or higher, not more than 50%.

Suggested change

// For multi-class classification, a value of 0.5 means that a predicted class must have more than 50% confidence

// For multi-class classification, a value of 0.5 means that a predicted class must have 50% or higher confidence

Copilot · 2025-10-09T23:34:56Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+	if result.Confidence < threshold {
+		observability.Infof("MCP classification confidence (%.4f) below threshold (%.4f)",
+			result.Confidence, threshold)
+		return "", float64(result.Confidence), nil
+	}


The threshold comparison logic conflicts with the comment at line 22. The code correctly implements 'confidence must be >= threshold' but the comment suggests 'more than threshold'.

Copilot · 2025-10-09T23:34:56Z

src/semantic-router/pkg/utils/classification/classifier.go

+// NewClassifier creates a new classifier with model selection and jailbreak/PII detection capabilities.
+// Both in-tree and MCP classifiers can be configured simultaneously for category classification.
+// At runtime, in-tree classifier will be tried first, with MCP as a fallback,
+// allowing flexible deployment scenarios such as gradual migration or A/B testing.


The documentation mentions 'A/B testing' as a use case, but the current implementation doesn't support A/B testing - it only provides a fallback mechanism where in-tree is always tried first. A/B testing would require random selection between classifiers.

Signed-off-by: Huamin Chen <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 17 out of 18 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/semantic-router/pkg/connectivity/mcp/types.go:1

The comment claims mcp-go doesn't export a version constant, but this should be verified. Consider using a constant from the mcp-go library if available to maintain consistency and avoid version mismatches.

package mcp

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-10T00:57:48Z

src/semantic-router/pkg/connectivity/mcp/stdio_client.go

+		}()
+
+		c.log(LoggingLevelDebug, fmt.Sprintf("Starting MCP client..."))
+		// Note: In mcp-go v0.42.0-beta.1, connection starts automatically; Start() method is not used


This version-specific comment may become outdated. Consider making this comment more generic or removing the specific version reference to improve maintainability.

Suggested change

// Note: In mcp-go v0.42.0-beta.1, connection starts automatically; Start() method is not used

// Note: Connection starts automatically; Start() method is not used

Copilot · 2025-10-10T00:57:48Z

examples/mcp-classifier-server/server.py

+        "--http", action="store_true", help="Run in HTTP mode instead of stdio"
+    )
+    parser.add_argument(
+        "--port", type=int, default=8090, help="HTTP port (default: 8090)"


[nitpick] The help text says "default: 8090" but this information is redundant since argparse automatically shows the default value. Consider simplifying to just "HTTP port".

Suggested change

"--port", type=int, default=8090, help="HTTP port (default: 8090)"

"--port", type=int, default=8090, help="HTTP port"

Signed-off-by: Huamin Chen <[email protected]>

Xunzhuo · 2025-10-10T09:17:30Z

config/config-mcp-classifier-example.yaml

+    transport_type: "http"           # HTTP transport
+    url: "http://localhost:8090/mcp" # MCP server endpoint
+
+    tool_name: "classify_text"       # MCP tool name to call


why should we specify the tool_name in client?

good point! the tool name is replaced by auto tool discovery now.

Xunzhuo · 2025-10-10T12:55:34Z

src/semantic-router/pkg/utils/classification/mcp_classifier.go

+	}
+
+	// Parse JSON response: {"class": int, "confidence": float, "model": str, "use_reasoning": bool}
+	var response struct {


so i think we rely on the mcp server to respect the format right?

should we move this into a standable api pkg? then the mcp server can export the pkg to follow the API.

sure, this one is moved to types.go

Signed-off-by: Huamin Chen <[email protected]>

rootfs · 2025-10-10T14:56:26Z

@Xunzhuo @wangchen615 PTAL. I am going to add the detailed MCP integration to the doc using the example server to explain how to add external classification support.

wangchen615

LGTM

Xunzhuo

great

feat: add out-of-tree and mcp based classification support

45a9792

Signed-off-by: Huamin Chen <[email protected]>

rootfs requested review from Xunzhuo and wangchen615 as code owners October 8, 2025 20:04

rootfs requested a review from Copilot October 8, 2025 20:04

github-actions bot assigned rootfs, wangchen615 and Xunzhuo Oct 8, 2025

rootfs marked this pull request as draft October 8, 2025 20:05

Copilot AI reviewed Oct 8, 2025

View reviewed changes

rootfs added 3 commits October 8, 2025 20:41

fix unit tests

2b43538

Signed-off-by: Huamin Chen <[email protected]>

update unit test

f7216a2

Signed-off-by: Huamin Chen <[email protected]>

review feedback

9dbe81d

Signed-off-by: Huamin Chen <[email protected]>

rootfs requested a review from Copilot October 9, 2025 01:48

Copilot AI reviewed Oct 9, 2025

View reviewed changes

review feedback

33cd307

Signed-off-by: Huamin Chen <[email protected]>

rootfs requested a review from Copilot October 9, 2025 02:16

Copilot AI reviewed Oct 9, 2025

View reviewed changes

review feedback

c935b5c

Signed-off-by: Huamin Chen <[email protected]>

rootfs requested a review from Copilot October 9, 2025 02:24

Copilot AI reviewed Oct 9, 2025

View reviewed changes

review feedback

d0284d3

Signed-off-by: Huamin Chen <[email protected]>

rootfs requested a review from Copilot October 9, 2025 02:29

Copilot AI reviewed Oct 9, 2025

View reviewed changes

review comments

75cd389

Signed-off-by: Huamin Chen <[email protected]>

rootfs requested a review from Copilot October 9, 2025 23:33

Copilot AI reviewed Oct 9, 2025

View reviewed changes

add example regex based classification mcp server

bb6ceb4

Signed-off-by: Huamin Chen <[email protected]>

rootfs force-pushed the fix-368 branch from 72031a5 to bb6ceb4 Compare October 10, 2025 00:56

rootfs requested a review from Copilot October 10, 2025 00:57

Copilot AI reviewed Oct 10, 2025

View reviewed changes

rootfs and others added 2 commits October 9, 2025 20:00

Merge branch 'main' into fix-368

3fb7cf1

review feedback

4482eb4

Signed-off-by: Huamin Chen <[email protected]>

rootfs marked this pull request as ready for review October 10, 2025 01:06

Xunzhuo reviewed Oct 10, 2025

View reviewed changes

rootfs and others added 6 commits October 10, 2025 13:44

review feedback: auto discover mcp tools

cef10c2

Signed-off-by: Huamin Chen <[email protected]>

verify fixes are working

30a4032

Signed-off-by: Huamin Chen <[email protected]>

verify fixes are working

e428b7a

Signed-off-by: Huamin Chen <[email protected]>

add missing file

cdbbf10

Signed-off-by: Huamin Chen <[email protected]>

fix lint

2a16d0b

Signed-off-by: Huamin Chen <[email protected]>

Merge branch 'main' into fix-368

66629c3

wangchen615 approved these changes Oct 10, 2025

View reviewed changes

Xunzhuo approved these changes Oct 10, 2025

View reviewed changes

rootfs merged commit ea0c6ea into vllm-project:main Oct 10, 2025
14 checks passed

	func NewLegacyClient(name string, config ClientConfig) *Client {
	func NewLegacyClient(name string, config ClientConfig) *Client {
	log.Printf("[DEPRECATION WARNING] NewLegacyClient is deprecated and will be removed in a future release. Please use the factory pattern instead.")

-// SSEMessage represents a Server-Sent Events message
+// SSEMessage represents a Server-Sent Events message
+type SSEMessage struct {
+	Event string      `json:"event,omitempty"`
+	Data  interface{} `json:"data"`
+	ID    string      `json:"id,omitempty"`
+	Retry int         `json:"retry,omitempty"`
+}

		// Set timeout if specified
		if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > 0 {

-	// Set timeout if specified
-	if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > 0 {
+	// Set timeout if specified and within reasonable bounds
+	const maxTimeoutSeconds = 300 // 5 minutes
+	if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > 0 {
+		if cfg.Classifier.MCPCategoryModel.TimeoutSeconds > maxTimeoutSeconds {
+			return fmt.Errorf("TimeoutSeconds (%d) exceeds maximum allowed (%d seconds)", cfg.Classifier.MCPCategoryModel.TimeoutSeconds, maxTimeoutSeconds)
+		}

	// Note: Start() method removed in mcp-go v0.9.0+ - connection starts automatically
	// Note: In mcp-go v0.42.0-beta.1 (current), connection starts automatically; Start() method is not used.

-			return mcp.CallToolRequest{}, fmt.Errorf("failed to parse arguments: %w", err)
+			argStr := openAICall.Function.Arguments
+			const maxLen = 200
+			if len(argStr) > maxLen {
+				argStr = argStr[:maxLen] + "...(truncated)"
+			}
+			return mcp.CallToolRequest{}, fmt.Errorf("failed to parse arguments (%q): %w", argStr, err)

		// A value of 0.5 is commonly used as it represents a neutral cutoff for binary or probabilistic classification,
		// meaning predictions with confidence above 50% are considered positive. Adjust as needed for your use case.

		c.Config.Classifier.MCPCategoryModel.ToolName != "" &&
		c.mcpCategoryInitializer != nil

		// MCPProtocolVersion is the MCP protocol version supported by this implementation
		MCPProtocolVersion = "2024-11-05"

	// For multi-class classification, a value of 0.5 means that a predicted class must have more than 50% confidence
	// For multi-class classification, a value of 0.5 means that a predicted class must have 50% or higher confidence

	// Note: In mcp-go v0.42.0-beta.1, connection starts automatically; Start() method is not used
	// Note: Connection starts automatically; Start() method is not used

	"--port", type=int, default=8090, help="HTTP port (default: 8090)"
	"--port", type=int, default=8090, help="HTTP port"

feat: add out-of-tree and mcp based classification support #375

feat: add out-of-tree and mcp based classification support #375

Uh oh!

Conversation

rootfs commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

MCP Classification Test

Run MCP Server

Run Router

Test prompt

MCP Server log

Router log

Uh oh!

netlify bot commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for vllm-semantic-router ready!

Uh oh!

github-actions bot commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

👥 vLLM Semantic Team Notification

📁 config

📁 Root Directory

📁 src

🎉 Thanks for your contributions!

Uh oh!

rootfs commented Oct 8, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

rootfs commented Oct 8, 2025 •

edited

Loading

netlify bot commented Oct 8, 2025 •

edited

Loading

github-actions bot commented Oct 8, 2025 •

edited

Loading

📁 `config`

📁 `Root Directory`

📁 `src`